Metabolic microarrays operate on the same principle as other kinds of microarrays

(Sect. 18.1) in which large numbers of small molecules are synthesized, typically

using combinatorial or other chemistry for generating high diversity. The array is then

exposed to the target, whose components of interest are usually labelled (although

their chemical diversity makes this more difficult than in the case of nucleic acids, for

example; moreover, the small size of metabolites makes it more likely that the label

chemically perturbs them). This technique can be used to answer questions such as

“to which metabolite(s) does macromolecule X bind?”

Much ingenuity is currently being applied to determine spatial variations in

selected metabolites. An example of a method developed for that purpose is PEBBLES

(probes encapsulated by biologically localized embedding): fluorescent dyes,

entrapped inside larger cage molecules, which respond (i.e., change their

fluorescence) to certain ions or molecules. Their spatial location in the cell can be

mapped using fluorescence microscopy. Another example is the development of

high-resolution scanning secondary ion mass spectrometry (“nanoSIMS”), whereby

a focused ion beam (usually Cs⁺ or O⁻) is scanned across a (somewhat conducting)

sample and the secondary ions released from the sample are detected mass

spectrometrically with a spatial resolution of some tens of nanometres. This method is

very favourable for certain metal ions, which can be detected at mole fractions of

as little as 10⁻⁶. If biomolecules are to be detected, it is advantageous to mark the

molecule or molecules of interest by enriching them with rare but stable isotopes

of their constituent atoms (e.g., ¹⁵N, whose natural abundance is typically less than

1%); the marked molecules can then easily be distinguished via the masses of their

fragments in the mass spectrometer. It is usually safe to assume that the physiological

effect of such marking is small. 40
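
As a rough illustration of how isotopically marked material might be picked out of such data, the sketch below compares the fraction of heavy-isotope counts in each pixel with the natural abundance. It is only a minimal sketch under stated assumptions: the pixel counts, the function names, and the threshold factor are hypothetical, and the natural ¹⁵N fraction is taken as approximately 0.0037.

```python
# Assumed natural 15N abundance (atom fraction), approximately 0.37 %.
NATURAL_15N_FRACTION = 0.0037


def heavy_fraction(heavy_counts, light_counts):
    """Fraction of heavy-isotope counts in one pixel, or None if no counts."""
    total = heavy_counts + light_counts
    if total == 0:
        return None
    return heavy_counts / total


def enrichment_map(heavy, light, threshold=5.0):
    """Flag pixels whose heavy fraction exceeds `threshold` times natural abundance.

    `heavy` and `light` are equally shaped lists of rows of ion counts.
    The threshold factor is a hypothetical choice, not a recommended value.
    """
    flagged = []
    for heavy_row, light_row in zip(heavy, light):
        row = []
        for h, l in zip(heavy_row, light_row):
            f = heavy_fraction(h, l)
            row.append(f is not None and f > threshold * NATURAL_15N_FRACTION)
        flagged.append(row)
    return flagged


# Hypothetical 2 x 3 count images for the heavy and light fragment ions.
heavy = [[4, 3, 120], [0, 5, 250]]
light = [[1000, 900, 880], [0, 1100, 750]]

labelled = enrichment_map(heavy, light)
for row in labelled:
    print(["labelled" if flag else "-" for flag in row])
```

Pixels without any counts are treated as unlabelled rather than raising an error; a real analysis would also have to consider counting statistics before drawing conclusions from a single pixel.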

As far as whole bodies are concerned, the blood is an extremely valuable organ

to analyse, since its composition sensitively depends on the state of the organism, to

the extent that blood is sometimes called “the sentinel of the body”.

23.13 Data Analysis

The first task in metabonomics is typically to correlate the presence of metabolites

with gene expression. One is therefore trying to correlate two datasets, each

containing hundreds of points, with each other. This in essence is a problem of pattern

recognition (Sect. 13.1). There are two categories of algorithms used for this task:

unsupervised and supervised.
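
To make the idea of correlating the two datasets concrete, the following minimal sketch computes a Pearson correlation coefficient for every metabolite/gene pair measured across the same samples. The data, the names `met_A`, `gene_1`, etc., and the choice of Pearson correlation are illustrative assumptions, not a method prescribed by the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # constant profile: correlation is undefined
    return cov / math.sqrt(var_x * var_y)


def cross_correlation(metabolites, transcripts):
    """Return {(metabolite, gene): r} for every metabolite/gene pair."""
    return {
        (m, g): pearson(m_levels, g_levels)
        for m, m_levels in metabolites.items()
        for g, g_levels in transcripts.items()
    }


# Hypothetical levels measured in the same five samples.
metabolites = {
    "met_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "met_B": [5.0, 3.0, 4.0, 1.0, 2.0],
}
transcripts = {
    "gene_1": [2.0, 4.0, 6.0, 8.0, 10.0],
    "gene_2": [9.0, 7.0, 8.0, 2.0, 3.0],
}

table = cross_correlation(metabolites, transcripts)
# List the pairs with the strongest (positive or negative) correlation first.
for (m, g), r in sorted(table.items(), key=lambda kv: -abs(kv[1])):
    print(f"{m} vs {g}: r = {r:+.2f}")
```

With hundreds of metabolites and genes the table grows to tens of thousands of pairs, which is why the pattern-recognition methods discussed next are needed to find structure in it.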

The unsupervised techniques determine whether there is any intrinsic clustering

within the dataset. Initial information is given as object descriptions, but the classes

to which the objects belong are not known beforehand. A widely used unsupervised

technique is principal component analysis (PCA, see Sect. 13.2.2). Essentially, the

original dataset is projected onto a space of lower dimension; for example, a set of

40 See Voigt and Matt (2004) for some insight into this question.